feat(eap): Add a column to store arrays in EAP #7493
This PR has a migration; here is the generated SQL
-- start migrations
-- forward migration events_analytics_platform : 0050_add_attributes_array_column
Local op: ALTER TABLE eap_items_1_local ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Distributed op: ALTER TABLE eap_items_1_dist ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Local op: ALTER TABLE eap_items_1_downsample_8_local ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Distributed op: ALTER TABLE eap_items_1_downsample_8_dist ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Local op: ALTER TABLE eap_items_1_downsample_64_local ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Distributed op: ALTER TABLE eap_items_1_downsample_64_dist ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Local op: ALTER TABLE eap_items_1_downsample_512_local ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
Distributed op: ALTER TABLE eap_items_1_downsample_512_dist ADD COLUMN IF NOT EXISTS attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1)) AFTER attributes_float_39;
-- end forward migration events_analytics_platform : 0050_add_attributes_array_column
-- backward migration events_analytics_platform : 0050_add_attributes_array_column
Distributed op: ALTER TABLE eap_items_1_dist DROP COLUMN IF EXISTS attributes_array;
Local op: ALTER TABLE eap_items_1_local DROP COLUMN IF EXISTS attributes_array;
Distributed op: ALTER TABLE eap_items_1_downsample_8_dist DROP COLUMN IF EXISTS attributes_array;
Local op: ALTER TABLE eap_items_1_downsample_8_local DROP COLUMN IF EXISTS attributes_array;
Distributed op: ALTER TABLE eap_items_1_downsample_64_dist DROP COLUMN IF EXISTS attributes_array;
Local op: ALTER TABLE eap_items_1_downsample_64_local DROP COLUMN IF EXISTS attributes_array;
Distributed op: ALTER TABLE eap_items_1_downsample_512_dist DROP COLUMN IF EXISTS attributes_array;
Local op: ALTER TABLE eap_items_1_downsample_512_local DROP COLUMN IF EXISTS attributes_array;
-- end backward migration events_analytics_platform : 0050_add_attributes_array_column
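The generated statements follow a regular pattern: four base tables, each with a `_local` and a `_dist` variant, where the forward migration touches local before distributed and the backward migration does the reverse. A minimal sketch of that pattern in plain Python is below. It does not use Snuba's migration API; the helper names `forward_ops` and `backward_ops` are hypothetical, and only the table names and column definition are taken from the SQL above.

```python
# Base table names copied from the generated SQL above.
BASE_TABLES = [
    "eap_items_1",
    "eap_items_1_downsample_8",
    "eap_items_1_downsample_64",
    "eap_items_1_downsample_512",
]

# Column definition copied from the generated SQL above.
COLUMN_DDL = "attributes_array JSON(max_dynamic_paths=128) CODEC (ZSTD(1))"


def forward_ops(bases=BASE_TABLES):
    """Hypothetical helper: ADD COLUMN on local tables first, then distributed."""
    ops = []
    for base in bases:
        for suffix in ("local", "dist"):
            ops.append(
                f"ALTER TABLE {base}_{suffix} ADD COLUMN IF NOT EXISTS "
                f"{COLUMN_DDL} AFTER attributes_float_39;"
            )
    return ops


def backward_ops(bases=BASE_TABLES):
    """Hypothetical helper: DROP COLUMN on distributed tables first, then local."""
    ops = []
    for base in bases:
        for suffix in ("dist", "local"):
            ops.append(
                f"ALTER TABLE {base}_{suffix} DROP COLUMN IF EXISTS attributes_array;"
            )
    return ops


if __name__ == "__main__":
    for statement in forward_ops() + backward_ops():
        print(statement)
```

Ordering the suffixes this way means a distributed table never references a column that is missing from its underlying local table.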
❌ 1 Tests Failed:
Column(
    "attributes_array",
    JSON(
        max_dynamic_paths=4096,
This number is way too high. It would create an additional 4096 columns on each table, almost 14000 columns in total, and I'm fairly certain we will reach it. All it would take is 4096 differently named values across our entire customer base.
I would recommend setting it quite low, like 32, and letting the shared structure take care of the rest.
The default is 1024, and ClickHouse is OK with a higher number of columns. I'm not sure we need a number as small as 32.
We want to take advantage of the shared structure, but we'll need to upgrade to 25.8 first. At the very least, we should keep the default value.
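To make the trade-off in this thread concrete, here is a toy model of how a `max_dynamic_paths` limit splits attribute names between dedicated subcolumns and the shared structure. It is only an illustration under simplifying assumptions: the function `place_paths` and the `attr_N` names are hypothetical, the first-come-first-served placement is a simplification, and nothing here reflects how ClickHouse implements the JSON type internally.

```python
def place_paths(paths, max_dynamic_paths):
    """Toy model: the first `max_dynamic_paths` distinct paths get a dedicated
    subcolumn; every later distinct path falls back to the shared structure."""
    dedicated = []
    shared = []
    seen = set()
    for path in paths:
        if path in seen:
            continue  # a path that already has a home needs no new placement
        seen.add(path)
        if len(dedicated) < max_dynamic_paths:
            dedicated.append(path)
        else:
            shared.append(path)
    return dedicated, shared


if __name__ == "__main__":
    # 100 hypothetical, differently named array attributes.
    paths = [f"attr_{i}" for i in range(100)]
    for limit in (32, 1024):
        dedicated, shared = place_paths(paths, limit)
        print(f"limit={limit}: {len(dedicated)} dedicated, {len(shared)} shared")
```

With a low limit most names end up in the shared structure, which is why the quality of that structure (and the ClickHouse version that provides it) matters to the choice of value.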
We want to start storing array values in EAP, and these can have multiple element types (arrays of ints, arrays of floats, etc.). This would use the new JSON column type in ClickHouse to store arrays of these various types.
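As a rough illustration of the kind of value such a column would hold, the sketch below builds one JSON document whose keys map to arrays of different element types. The attribute names and the `element_type` helper are hypothetical and are not taken from the PR; the actual payload shape written by EAP may differ.

```python
import json

# Hypothetical array attributes for a single item; names are illustrative only.
attributes_array = {
    "retry_delays_ms": [100, 200, 400],   # array of ints
    "sample_rates": [0.25, 0.5],          # array of floats
    "feature_flags": ["a", "b"],          # array of strings
}


def element_type(values):
    """Return the single Python element type name of a list, or 'mixed'."""
    kinds = {type(v).__name__ for v in values}
    return kinds.pop() if len(kinds) == 1 else "mixed"


# Serialize to the JSON text that would be handed to the database client.
payload = json.dumps(attributes_array)

if __name__ == "__main__":
    print(payload)
    for name, values in attributes_array.items():
        print(name, element_type(values))
```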